INTRODUCTION

Tumor growth is associated with the expression of mutated gene
products, inappropriate gene expression, and the breakdown of
tissue architecture, leading to the exposure and release into the
peripheral circulation of sequestered antigens (1,2). Whether
these circulating, mutated or newly displayed tumor-associated
antigens elicit an autologous humoral immune response in the breast
tumor patient is of vital interest. Isolation, identification and
characterization of novel breast tumor-associated autoantigens
might yield new insights into the disease process; moreover, such
autoantigens may be developed into diagnostic screening tests and
potential targets for immunotherapy.

The screening of cDNA expression libraries with autologous patient
serum is a powerful technique that has been used successfully for
the identification of autoimmune disease antigens (3), and that we
have adapted for the identification of autoantigens in cDNA
libraries made from breast tumor mRNA. After screening cDNA
libraries derived from primary ductal breast carcinomas with
autologous patient serum, we detected and isolated three
immunoreactive cDNA clones, all three of which are newly discovered
gene products. The first autoantigen isolate, Ngp-1, has been
characterized and is a nucleolar GTP-binding protein that appears
to be a vital component of the pre-mRNA processing machinery. The
predicted amino acid sequence of the second clone (tentatively
named LMO4) contains two LIM domain motifs and bears a 60% homology
in this region to a known oncogene, LMO1. Our studies have
identified novel proteins that appear to play vital roles in the
regulation of cellular growth, and may help in the understanding of
normal cellular proliferation as well as malignancy.

BODY

Most of our effort for the past year has focused on characterizing
and defining the role of our second breast tumor autoantigen
isolate (tentatively named LMO4), which is a newly discovered gene.
We chose to focus on LMO4 because we found it to be related to a
group of known oncogenes, and our observations indicate that it is
an important regulatory element in cellular growth and
differentiation and possibly relevant to breast cancer. Work is
continuing on our first autoantigen isolate, Ngp-1, the nucleolar
GTP-binding protein (4), to identify other proteins that interact
with it using the yeast two-hybrid vector system (5). In a study
characterizing the networks of interactions between yeast proteins
(6), it was reported that the yeast homologue of Ngp-1 interacted
with six other yeast proteins, three of which are known pre-mRNA
splicing factors. Furthermore, the Ngp-1 homologue was found to
occupy a central role in this regulatory mechanism, and
disruption of the Ngp-1 open reading frame (ORF) was lethal to the
organism. We have not been able to further characterize the third
autoantigenic clone (Auag3), which, with the exception of expressed
sequence tags in the databases, shows no homology to any known
gene.

The 5' end 600 bases of the LMO4 sequence proved to be highly GC
rich (74%), with some stretches exceeding 90%. Formamide-containing
sequencing gels had to be used to obtain an accurate sequence. The
high GC content of the 5' portion of the cDNA explains the
under-representation of full-length clones in cDNA libraries, since
it would impart a high degree of secondary structure, interfering
with reverse transcription. Standard PCR reactions were also
ineffective across this region, and special formulations for
melting high-GC DNA had to be used.
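
The two figures quoted above (overall GC content and local stretches
above 90%) are simple base counts. The following is a minimal sketch
of how such values could be computed; the function names, the window
and step sizes, and the toy sequence are illustrative assumptions and
are not the LMO4 sequence or the method actually used.

```python
def gc_fraction(seq: str) -> float:
    """Return the fraction of G and C bases in a DNA sequence."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)


def windowed_gc(seq: str, window: int = 50, step: int = 10):
    """Yield (start, gc_fraction) for each full window along the sequence."""
    for start in range(0, len(seq) - window + 1, step):
        yield start, gc_fraction(seq[start:start + window])


# Hypothetical toy sequence for illustration only (NOT the LMO4 5' end).
toy = "GCGCGGCCGC" * 3 + "ATGCATGCAT" * 2

overall = gc_fraction(toy)
# Windows whose GC fraction exceeds 90%, as start positions (0-based).
peaks = [(s, gc) for s, gc in windowed_gc(toy, window=10, step=10) if gc > 0.9]
```

Applied to a real 5' end, the window size would be chosen to match
the length of the stretches of interest.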

Analysis of the complete LMO4 cDNA sequence revealed an open
reading frame from nt 781 to 1278, with an ATG codon in the
preferred configuration (7), with an A in position -3 and a G in
position +4, located at the start of this open reading frame. The
amino acid sequence predicted by this open reading frame contains
two tandem LIM domain motifs, which conform to the consensus
sequence of all known LIM domains (8) and occupy almost the entire
length of the 165 amino acid translation product. LIM domains are
found in a variety of proteins and describe a cysteine-rich,
zinc-binding motif that interacts with other proteins.
Computer homology searches of the nucleic acid databases detected
a 62% identity between the region encoding the two tandem LIM
domains and the analogous region of LMO1, a putative oncogene
associated with a chromosomal breakpoint in a subset of T-cell
leukemias (9,10). The size of the predicted translation product of
our isolate (165 aa) is similar to that of the LMO proteins LMO1
(160 aa) and LMO2 (158 aa) (9,10). In view of these similarities, we
tentatively named the gene of our cDNA isolate LMO4. At the amino
acid level, the identity within the LIM domains of LMO1 and LMO4 is
55%, and the spacing of the amino acids making up the LIM domains is
identical. The LIM domain sequences are so highly conserved that
the identity of LMO4 protein to the LMO1 homologue of Drosophila is
just as extensive (54%). A number of the expressed sequence tags
found in homology searches were from mouse and rat cDNA libraries
and showed identity in a range of 90 to 98%, indicating that LMO4
is highly conserved in evolution. Although the long untranslated 5'
end of LMO4 cDNA shows a slight homology to some other genes with
GC-rich 5' end regions, the sequence is unique, as is the long 3'
end. LMO4 does not seem to have any other closely related genes,
since in our efforts to obtain full-length clones, after extensive
hybridization screening of different cDNA libraries, we failed to
isolate any related cDNAs.
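
The relationship between the open reading frame coordinates and the
size of the translation product, and the initiation context at
positions -3 and +4, can be checked with a few lines of code. This is
a minimal sketch: it assumes that nt 1278 is the last base of the
stop codon, and the function names and toy sequences are hypothetical.

```python
def orf_protein_length(start_nt: int, end_nt: int) -> int:
    """Residues encoded by an ORF given 1-based inclusive coordinates.

    Assumption: end_nt is the last base of the stop codon.
    """
    n_bases = end_nt - start_nt + 1
    if n_bases % 3 != 0:
        raise ValueError("ORF length is not a multiple of three")
    return n_bases // 3 - 1  # the stop codon encodes no residue


def has_preferred_context(seq: str, atg_index: int) -> bool:
    """True if the ATG at atg_index (0-based index of its A) has an A
    at position -3 and a G at position +4 (the A of ATG being +1)."""
    seq = seq.upper()
    if atg_index < 3 or atg_index + 4 > len(seq):
        return False
    if seq[atg_index:atg_index + 3] != "ATG":
        return False
    return seq[atg_index - 3] == "A" and seq[atg_index + 3] == "G"


# Coordinates quoted in the text: nt 781 to 1278.
lmo4_residues = orf_protein_length(781, 1278)

# Hypothetical toy sequences, not taken from the LMO4 cDNA.
strong = has_preferred_context("GCCACCATGGCG", 6)
weak = has_preferred_context("GCCTCCATGTCG", 6)
```

Under the stated assumption, the 498 bases from nt 781 to 1278
correspond to 166 codons, that is, 165 residues plus the stop codon.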

Northern blot analysis to assess tissue distribution of this gene
product revealed it to be present in most tissues analyzed, with
highest expression in brain, skeletal muscle, testis and ovary,
and with little or no expression in liver, kidney and pancreas. Two
bands of approximately 2.1 and 1.9 kb could be discerned in most
tissue samples; however, the larger 2.1 kb band is the most
prominent in the normal tissue samples, while in breast tumor mRNA
the smaller 1.9 kb band is the most prominent. This pattern of
expression was observed in all breast tumor mRNA samples analyzed.
The same two bands hybridized with either a 600 bp 5' end probe or
a probe containing open reading frame and 3' end sequences. Since
breast tumors are a complex mixture of different cell types
(stromal fibroblasts, infiltrating lymphocytes and transformed
breast epithelial cells), the exact source of the LMO4 transcripts
in breast tumors remains to be determined.

A variant LMO4 cDNA clone, with a 112 base pair deletion in the 5'
region (nt 81 - 192), was isolated from different cDNA libraries.
This clone probably represents the smaller band observed in
northern blots. The deletion alters some possible short open
reading frames in the 5' untranslated region, and its role in the
expression of LMO4 protein is yet to be determined. Short open
reading frames in the 5' untranslated region have been detected in
other tightly controlled genes and have been shown to suppress
translation.
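
Short open reading frames of this kind can be located by scanning the
5' untranslated region for an ATG followed by an in-frame stop codon.
The sketch below illustrates the idea and shows how a deletion can
remove such a frame; the function name and the toy sequences are
hypothetical and do not come from the LMO4 cDNA.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_short_orfs(utr: str):
    """Return (start, end) 0-based half-open spans of every ATG that is
    followed by an in-frame stop codon within the given sequence."""
    utr = utr.upper()
    spans = []
    for i in range(len(utr) - 2):
        if utr[i:i + 3] != "ATG":
            continue
        # Walk codon by codon from the ATG until the first stop codon.
        for j in range(i + 3, len(utr) - 2, 3):
            if utr[j:j + 3] in STOP_CODONS:
                spans.append((i, j + 3))
                break
    return spans


# Hypothetical toy 5' region containing one short ORF (ATG GCC TAA).
full_length = "CCATGGCCTAAGG"
# The same region with the ATG deleted, mimicking a deletion variant.
deleted = full_length[:2] + full_length[5:]

orfs_full = find_short_orfs(full_length)
orfs_deleted = find_short_orfs(deleted)
```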

The extremely long 5' end of LMO4 cDNA (780 bp) is the GC-rich
region. A long, GC-rich, structured 5' leader sequence is
characteristic of transcripts encoding oncoproteins, growth
factors, transcription factors, and other regulatory proteins
that seem to be designed to be translated poorly (11). Inhibition
at the translational level seems to be a component of gene
regulation for genes which need to be tightly regulated. Another
feature of the sequence of LMO4 cDNA is the presence of multiple
ATTT motifs in the 3' end, which have also been observed in the 3'
untranslated region of numerous lymphokine, cytokine, and
proto-oncogene mRNAs. It has been proposed that such ATTT motifs are
involved in the selective degradation of transiently expressed
messengers (12).
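
Counting such motifs in a 3' end is a simple pattern search. The
sketch below counts occurrences with overlaps allowed; the function
name and the toy sequence are hypothetical, and the real LMO4 3' end
is not reproduced here.

```python
import re


def count_motif(seq: str, motif: str = "ATTT") -> int:
    """Count occurrences of motif in seq, allowing overlapping matches."""
    # A zero-width lookahead matches at every position where the motif starts.
    return len(re.findall(f"(?={re.escape(motif)})", seq.upper()))


# Hypothetical toy 3' region, for illustration only.
toy_3prime = "GGATTTATTTACCATTTTG"

n_attt = count_motif(toy_3prime)            # default ATTT motif
n_attta = count_motif(toy_3prime, "ATTTA")  # the longer ATTTA pentamer
```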

The amino terminal amino acid sequences immediately preceding the
LIM domains of both LMO1 and LMO2 have been shown to be
transactivation domains (13). It has been shown that the four
proline residues within the 19 amino acid long activation domain of
LMO2 play an important role in conferring full transactivation
activity to this domain (13). The amino terminal sequence of the
predicted LMO4 protein (MVNPGSSSQPPPVTAGSLSLSW), although not
homologous to the analogous sequences of LMO1 or LMO2, is also
proline rich and may also be an activation domain.
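
The proline content of the amino terminal sequence quoted above can
be tallied directly. This is a minimal sketch using only the 22
residue sequence given in the text; the variable names are
illustrative.

```python
from collections import Counter

# Amino terminal sequence of the predicted LMO4 protein, as quoted in the text.
n_term = "MVNPGSSSQPPPVTAGSLSLSW"

counts = Counter(n_term)
n_prolines = counts["P"]
proline_fraction = n_prolines / len(n_term)
```

The sequence contains four prolines in 22 residues (about 18%), the
same number of prolines as reported for the 19 residue activation
domain of LMO2.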

As a result of our major effort during the past year, we have
obtained further evidence for the role and importance of LMO4 in
cellular growth regulation by using the yeast two-hybrid screen
with an LMO4 LIM domain construct in a binding domain bait plasmid.
We have identified five gene products which were isolated numerous
times as activation domain (AD) co-transformants with the LMO4 bait
plasmid. We are now in the process of verifying the authenticity
of these two-hybrid interactions by biochemical binding assays,
immunoassays, and individual yeast co-transformations with
positive AD clones containing frameshift mutations. All five
putative LMO4-binding gene products have been sequenced and
identified:

1. Suppressin - An uncharacterized gene product, identified as a
63 kDa inhibitor of cell proliferation in the Genbank database (#
U59659).

2. Kinesin-2 - (14) A member of a superfamily of motor proteins
that are implicated in mechanisms of mitosis or meiosis; it is
markedly upregulated in tumor cells after retinoid treatment.

3. SUPT6H - (15) An extremely conserved mammalian nuclear protein
that regulates transcription through establishment or maintenance
of chromatin structure, and appears to play a significant role in
the human estrogen receptor signal transduction pathway.

4. NCS-1 - (16) A calcium sensor protein involved in the
phosphorylation of components of the signal transduction machinery.

5. eIF3 - An uncharacterized human translation initiation factor.

Although we have not yet verified by biochemical and immunological
means the positive interactions of these five gene products with
LMO4, all positive clones were found to have open reading frames
in frame with the AD coding region, supporting the interpretation
that the two-hybrid interactions represent real affinities. In
addition, the positive clones for each gene product were not all
identical; however, they all had portions of their ORFs in frame
with the AD coding region. There appears to be a common thread in
the functions of these proteins, which further supports the
likelihood that LMO4 is a vital mediator of differentiation and
development, and potentially relevant to malignancy. None of
these five gene products has been fully characterized, nor have
they been implicated in any known transcription or other regulatory
complexes.

CONCLUSIONS

We have identified and characterized two new gene products (Ngp-1
and LMO4), both of which appear to play vital roles in cellular
growth and differentiation. LMO4 appears to be especially relevant
to cellular growth control because of its interaction with other
gene products involved in gene regulation and signal transduction.
Our observations that the expression of LMO4 transcripts in breast
tumors differs from that in other tissues, and that LMO4 interacts
with a protein involved in estrogen receptor signal transduction,
hint at a possible role of LMO4 in breast cancer and merit
further investigation. Structural features of the LMO4 cDNA
sequence (a long, GC-rich, structured 5' end, the presence of mRNA
destabilizing motifs in the 3' end, and a predicted amino acid
sequence which contains two LIM domain motifs with a partial
homology to a known oncogene) all suggest that this gene plays a
vital role in the life of the organism.